On the Complexity of Generating Maximal Frequent and Minimal Infrequent Sets

Authors

  • Endre Boros
  • Vladimir Gurvich
  • Leonid Khachiyan
  • Kazuhisa Makino
Abstract

Let A be an m × n binary matrix, t ∈ {1, …, m} be a threshold, and ε > 0 be a positive parameter. We show that given a family of O(n^ε) maximal t-frequent column sets for A, it is NP-complete to decide whether A has any further maximal t-frequent sets, even when the number of such additional maximal t-frequent column sets may be exponentially large. In contrast, all minimal t-infrequent sets of columns of A can be enumerated in incremental quasi-polynomial time. The proof of the latter result follows from the inequality α ≤ (m − t + 1)β, where α and β are, respectively, the numbers of all maximal t-frequent and all minimal t-infrequent sets of columns of the matrix A. We also discuss the complexity of generating all closed t-frequent column sets for a given binary matrix.
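To make the objects in the abstract concrete, here is a minimal brute-force sketch in Python. It assumes the usual definition, which the abstract itself does not spell out: a column set is t-frequent if at least t rows of A have a 1 in every column of the set, and t-infrequent otherwise. The function names `support` and `frequent_borders` and the example matrix are illustrative, not taken from the paper, and the exponential enumeration is only meant for tiny inputs; it is not the incremental quasi-polynomial algorithm the abstract refers to.

```python
from itertools import combinations


def support(matrix, cols):
    """Count the rows that have a 1 in every column of `cols`."""
    return sum(all(row[c] for c in cols) for row in matrix)


def frequent_borders(matrix, t):
    """Brute-force the maximal t-frequent and minimal t-infrequent column sets.

    Assumed definition: a column set is t-frequent if its support is >= t.
    """
    n = len(matrix[0])
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    frequent = {s for s in subsets if support(matrix, s) >= t}
    infrequent = [s for s in subsets if s not in frequent]
    # Frequency is monotone (subsets of frequent sets are frequent), so it is
    # enough to test single-column extensions and single-column removals.
    maximal = [s for s in frequent
               if all(s | {c} not in frequent for c in range(n) if c not in s)]
    minimal = [s for s in infrequent
               if all(s - {c} in frequent for c in s)]
    return maximal, minimal


# Hypothetical 4 x 3 example with threshold t = 2.
A = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
t = 2
maximal, minimal = frequent_borders(A, t)
alpha, beta = len(maximal), len(minimal)
bound = (len(A) - t + 1) * beta
print(sorted(sorted(s) for s in maximal))  # the three column pairs
print(sorted(sorted(s) for s in minimal))  # the full column set
print(alpha, beta, bound)                  # 3 1 3
```

On this example every pair of columns is 2-frequent and only the full column set is 2-infrequent, so α = 3, β = 1, and the bound (m − t + 1)β = 3 is met with equality.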

Similar resources

Finding All Minimal Infrequent Multi-dimensional Intervals

Let D be a database of transactions on n attributes, where each attribute specifies a (possibly empty) real closed interval I = [a, b] ⊆ R. Given an integer threshold t, a multi-dimensional interval I = ([a1, b1], …, [an, bn]) is called t-frequent if (every component interval of) I is contained in (the corresponding component of) at least t transactions of D; otherwise, I is said to be...

Some results on maximal open sets

In this paper, the notion of maximal m-open set is introduced and its properties are investigated. Some results about existence of maximal m-open sets are given. Moreover, the relations between maximal m-open sets in an m-space and maximal open sets in the corresponding generated topology are considered. Our results are supported by examples and counterexamples.

Generating Dual-Bounded Hypergraphs

This paper surveys some recent results on the generation of implicitly given hypergraphs and their applications in Boolean and integer programming, data mining, reliability theory, and combinatorics. Given a monotone property π over the subsets of a finite set V , we consider the problem of incrementally generating the family Fπ of all minimal subsets satisfying property π, when π is given by a...

Separating Structure from Interestingness

Condensed representations of pattern collections have been recognized to be important building blocks of inductive databases, a promising theoretical framework for data mining, and recently they have been studied actively. However, there has not been much research on how condensed representations should actually be represented. In this paper we propose a general approach to build condensed repr...

An Algorithm for Mining Maximum Frequent Itemsets Using Data-sets Condensing and Intersection Pruning

Discovering maximal frequent itemsets is a key issue in data mining. Apriori-like algorithms use a candidate-itemset generate-and-test method, but this approach is highly time-consuming. To avoid generating either a vast volume of candidate itemsets or a frequent-pattern tree, the DCIP algorithm uses data-set condensing and intersection pruning to f...

Publication date: 2002